Pattern Matching Refinements to Dictionary-Based Code-Switching Point Detection
Abstract
This study presents the development and evaluation of pattern matching refinements (PMRs) for automatic code-switching point (CSP) detection. With all PMRs applied, evaluation showed an accuracy of 94.51%. This is an improvement over the reported accuracy of dictionary-based approaches, which ranges from 75.22% to 76.26% (Yeong and Tan, 2010). In our experiments, a 100-sentence Tagalog-English corpus was used as the test bed. Analyses showed that the dictionary-based approach with part-of-speech checking yielded an accuracy of only 79.76%, and that two linguistic phenomena caused this low accuracy: (1) intra-word code-switching and (2) common words, that is, words shared by both languages. The devised PMRs, namely (1) common word exclusion, (2) common word identification, and (3) common n-gram pruning, address these phenomena and improved accuracy. The work can be extended using audio files and machine learning with larger language resources.
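The abstract names the refinements but does not specify how they work, so the following is only a minimal sketch of a plain dictionary lookup plus one possible reading of common word exclusion. The toy word lists, the function names `tag_tokens` and `switch_points`, and the tie-break that labels a shared word as Tagalog when exclusion is off are all assumptions made for illustration, not the authors' implementation.

```python
# Toy word lists (hypothetical); a real system would load full dictionaries.
TAGALOG = {"pupunta", "ako", "sa", "mamaya", "at", "ang"}
ENGLISH = {"i", "will", "go", "to", "the", "mall", "at", "noon"}


def tag_tokens(tokens, exclude_common=True):
    """Label each token 'TL', 'EN', or None (unknown or left undecided)."""
    # Words found in both dictionaries, e.g. "at" (Tagalog for "and").
    common = TAGALOG & ENGLISH if exclude_common else set()
    tags = []
    for token in tokens:
        word = token.lower()
        if word in common:
            tags.append(None)   # assumed behaviour: skip ambiguous words
        elif word in TAGALOG:
            tags.append("TL")   # arbitrary tie-break when exclusion is off
        elif word in ENGLISH:
            tags.append("EN")
        else:
            tags.append(None)   # out-of-dictionary word
    return tags


def switch_points(tags):
    """Return indices where the label differs from the last labelled token."""
    points = []
    previous = None
    for index, tag in enumerate(tags):
        if tag is None:
            continue
        if previous is not None and tag != previous:
            points.append(index)
        previous = tag
    return points


english_only = "I will go to the mall at noon".split()
mixed = "Pupunta ako sa mall mamaya".split()

naive = switch_points(tag_tokens(english_only, exclude_common=False))
refined = switch_points(tag_tokens(english_only, exclude_common=True))
mixed_points = switch_points(tag_tokens(mixed))
```

In this toy setting, the naive lookup reports two spurious switch points around "at" in an all-English sentence, while skipping shared words reports none; the genuinely mixed sentence still yields switch points at "mall" and "mamaya".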
Publication date: 2012